
feat(core): auto-discover test cases from directory structure #1142

Merged
christso merged 3 commits into main from feat/1141-directory-discovery
Apr 17, 2026

@christso
Collaborator

Closes #1141

Summary

  • When tests: resolves to a directory, scan subdirectories for case.yaml/case.yml files
  • Directory name becomes the test id unless the case file specifies one
  • A workspace/ subdirectory in the case directory auto-sets workspace.template
  • Subdirectories without case.yaml are skipped with a warning
  • Cases are sorted alphabetically for deterministic ordering
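
The discovery rules above can be sketched roughly as follows. This is a minimal illustration, not the code in case-file-loader.ts: the `discoverCases` name, the `DiscoveredCase` shape, and the warning text are assumptions made for the example, and it does not parse the case file, so the "case file may override the id" rule is only noted in a comment.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical result shape; the real types in packages/core may differ.
interface DiscoveredCase {
  id: string; // directory name (the real loader lets the case file override it)
  caseFile: string; // path to case.yaml or case.yml
  workspaceTemplate?: string; // set only when a workspace/ subdirectory exists
}

const CASE_FILE_NAMES = ["case.yaml", "case.yml"];

function discoverCases(
  root: string,
  warn: (msg: string) => void = console.warn,
): DiscoveredCase[] {
  // Sort by code unit so the order does not depend on the host locale.
  const dirNames = fs
    .readdirSync(root, { withFileTypes: true })
    .filter((entry) => entry.isDirectory())
    .map((entry) => entry.name)
    .sort((a, b) => (a < b ? -1 : a > b ? 1 : 0));

  const cases: DiscoveredCase[] = [];
  for (const dirName of dirNames) {
    const caseDir = path.join(root, dirName);
    const caseFile = CASE_FILE_NAMES.map((f) => path.join(caseDir, f)).find((p) =>
      fs.existsSync(p),
    );
    if (caseFile === undefined) {
      warn(`Skipping ${dirName}: no case.yaml or case.yml found`);
      continue;
    }
    const discovered: DiscoveredCase = { id: dirName, caseFile };
    const workspaceDir = path.join(caseDir, "workspace");
    if (fs.existsSync(workspaceDir) && fs.statSync(workspaceDir).isDirectory()) {
      discovered.workspaceTemplate = workspaceDir;
    }
    cases.push(discovered);
  }
  return cases;
}

// Usage: build a throwaway tree with two cases and one directory without a case file.
const root = fs.mkdtempSync(path.join(os.tmpdir(), "discovery-"));
for (const dirName of ["fix-null-check", "add-greeting", "notes"]) {
  fs.mkdirSync(path.join(root, dirName));
}
fs.writeFileSync(path.join(root, "add-greeting", "case.yaml"), "input: hello\n");
fs.writeFileSync(path.join(root, "fix-null-check", "case.yml"), "input: fix\n");
fs.mkdirSync(path.join(root, "fix-null-check", "workspace"));

const warnings: string[] = [];
const found = discoverCases(root, (msg) => warnings.push(msg));
console.log(found.map((c) => c.id), warnings);
```

Passing the warning sink as a parameter keeps the sketch easy to test; the real loader may report skipped directories differently.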

Changes

| File | Change |
| --- | --- |
| packages/core/src/evaluation/loaders/case-file-loader.ts | New loadCasesFromDirectory() function |
| packages/core/src/evaluation/yaml-parser.ts | Directory detection in string path branch |
| packages/core/test/evaluation/loaders/case-file-loader.test.ts | 9 unit tests + 1 integration test |
| examples/showcase/directory-discovery/ | Showcase example |
| apps/web/src/content/docs/docs/evaluation/eval-files.mdx | Documentation for directory-based discovery |

Test plan

  • All 2253 existing tests pass (zero regressions)
  • 10 new tests covering: happy path, id injection, id precedence, skip warning, alphabetical order, workspace template, explicit workspace, empty dir, .yml extension, integration via loadTestSuite
  • validate:examples passes (56/56)
  • --dry-run with showcase example discovers both cases correctly
  • Pre-push hooks pass (build, typecheck, lint, test, validate)

Dry-run evidence

$ bun apps/cli/src/cli.ts eval examples/showcase/directory-discovery/EVAL.yaml --dry-run
Using target: default → default-dry-run
0/2   🔄 add-greeting | default → default-dry-run
0/2   🔄 fix-null-check | default → default-dry-run
...
RESULT: FAIL  (0/2 scored >= 80%, mean: 0%)  # expected with --dry-run

Both cases discovered from directory structure with correct ids.

🤖 Generated with Claude Code

christso and others added 2 commits April 17, 2026 13:28
When `tests:` points to a directory, scan subdirectories for `case.yaml`
files. Directory name becomes the test `id` unless overridden. A
`workspace/` subdirectory auto-sets the workspace template.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@cloudflare-workers-and-pages

cloudflare-workers-and-pages Bot commented Apr 17, 2026

Deploying agentv with Cloudflare Pages

Latest commit: d103697
Status: ✅  Deploy successful!
Preview URL: https://7f2394ca.agentv.pages.dev
Branch Preview URL: https://feat-1141-directory-discover.agentv.pages.dev


- Update eval-validator to recognize directory paths (no false warning)
- Use lexicographic sort instead of locale-dependent localeCompare
- Use strict null check for id injection (not falsy check)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
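
To illustrate the last two points of this commit, here is a small sketch of why a code-unit comparison and a strict null check behave differently from `localeCompare` and a falsy check. The helper names `compareLexicographic` and `resolveId` are hypothetical, and treating "strict null check" as a comparison against `undefined` and `null` is an assumption about what the commit does.

```typescript
// Code-unit comparison: the result does not depend on the host locale or ICU data,
// unlike String.prototype.localeCompare.
function compareLexicographic(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Hypothetical helper: inject the directory name only when the case file
// has no id at all (undefined or null), not whenever the id is falsy.
function resolveId(
  caseId: string | number | null | undefined,
  dirName: string,
): string | number {
  return caseId === undefined || caseId === null ? dirName : caseId;
}

const names = ["b-case", "A-case", "a-case"];
const sorted = [...names].sort(compareLexicographic);
// Code units: "A" (65) < "a" (97) < "b" (98)
console.log(sorted); // ["A-case", "a-case", "b-case"]

console.log(resolveId(undefined, "add-greeting")); // "add-greeting"
console.log(resolveId("custom-id", "add-greeting")); // "custom-id"
console.log(resolveId(0, "add-greeting")); // 0 (a falsy check would have replaced it)
```

A falsy check such as `caseId || dirName` would silently overwrite ids like `0` or an empty string, which is presumably the case the strict check guards against.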
@christso
Collaborator Author

Manual E2E Verification (Red/Green UAT)

RED — main branch

$ bun apps/cli/src/cli.ts eval /tmp/e2e-dir-discovery/EVAL.yaml --dry-run
Error: Cannot read external test file: /tmp/e2e-dir-discovery/cases
  EISDIR: illegal operation on a directory, read
exit code: 1

tests: ./cases/ pointing to a directory fails with EISDIR on main.
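
For context, the failure comes from reading the path as a file without checking what it is. A minimal sketch of the kind of check that avoids it is shown below; `classifyTestsPath` is a hypothetical helper, not the function in yaml-parser.ts, and it does not handle a missing path (`statSync` throws in that case).

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: classify a resolved `tests:` path before reading it.
function classifyTestsPath(resolvedPath: string): "directory" | "file" {
  return fs.statSync(resolvedPath).isDirectory() ? "directory" : "file";
}

// Usage: one directory and one file in a throwaway location.
const tmp = fs.mkdtempSync(path.join(os.tmpdir(), "tests-path-"));
const casesDir = path.join(tmp, "cases");
const casesFile = path.join(tmp, "cases.yaml");
fs.mkdirSync(casesDir);
fs.writeFileSync(casesFile, "- id: file-test\n");

const dirKind = classifyTestsPath(casesDir);
const fileKind = classifyTestsPath(casesFile);
console.log(dirKind, fileKind); // directory file

// Reading the directory as if it were a file throws, which is the failure mode on main.
let readError: unknown;
try {
  fs.readFileSync(casesDir, "utf8");
} catch (err) {
  readError = err;
}
console.log(readError instanceof Error);
```

With a check like this in the string path branch, a directory can be routed to discovery while a file path keeps going through the existing external-file loader.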

GREEN — feature branch

$ bun apps/cli/src/cli.ts eval /tmp/e2e-dir-discovery/EVAL.yaml --dry-run
Using target: default → default-dry-run
0/2   🔄 add-greeting | default → default-dry-run
0/2   🔄 fix-null-check | default → default-dry-run
1/2   ⚠️ add-greeting | default → default-dry-run | 0% FAIL
2/2   ⚠️ fix-null-check | default → default-dry-run | 0% FAIL
RESULT: FAIL  (0/2 scored >= 80%, mean: 0%)

Both cases discovered from directory structure. 0% scores expected with --dry-run.

Backward Compatibility

Inline tests and tests: ./file.yaml still work:

--- backward compat: inline tests ---
1/1   ⚠️ inline-test | default → default-dry-run | 0% FAIL
RESULT: FAIL  (0/1 scored >= 80%, mean: 0%)
--- backward compat: file path tests ---
1/1   ⚠️ file-test | default → default-dry-run | 0% FAIL
RESULT: FAIL  (0/1 scored >= 80%, mean: 0%)

All existing test resolution modes unaffected.

@christso christso marked this pull request as ready for review April 17, 2026 13:58
@christso christso merged commit 2df533a into main Apr 17, 2026
4 checks passed
@christso christso deleted the feat/1141-directory-discovery branch April 17, 2026 13:58